Abstract

This paper focuses on two disparate aspects of German syntax from the
perspective of parallel grammar development. As part of a cooperative
project, we present an innovative approach to auxiliaries and multiple
genitive NPs in German. The LFG-based implementation presented here
avoids unnecessary structural complexity in the representation of
auxiliaries by challenging the traditional analysis of auxiliaries as
raising verbs. The approach developed for multiple genitive NPs provides
a more abstract, language independent representation of genitives
associated with nominalized verbs. Taken together, the two approaches
represent a step towards providing uniformly applicable treatments for
differing languages, thus lightening the burden for machine translation.



1 Introduction 

Within the cooperative parallel grammar project 
PARGRAM (IMS-Stuttgart, Xerox-Palo Alto, 
Xerox-Grenoble), the analysis and representation 
of structures in the grammars must be viewed 
from a more global perspective than that of the 
individual languages (German, English, French). 
One major goal of PARGRAM is the development
of broad coverage grammars which are also mod- 
ular and easy to maintain. Another major goal 
is the construction of parallel analyses for sen- 
tences of the same type in German, English, and 
French. If this can be achieved, the problem faced 
by machine translation (MT) could be greatly re- 
duced. Due to the recent development of a faster 
and more powerful version of the LFG (Lexical- 
Functional-Grammar) based Grammar Writer's 
Workbench (Kaplan and Maxwell 1993) at Xerox, 



the implementation of a linguistically adequate, 
broad coverage grammar appears viable. Given 
the flexible projection-based architecture of LFG 
(Dalrymple et al. 1995) and the MT approach presented
in Kaplan et al. (1989),^1 a robust MT system
is already in place.

In this paper, we concentrate on two issues 
within the broader perspective of PARGRAM: the
treatment of auxiliaries and the transparent rep- 
resentation of multiple genitive NPs in German. 
These phenomena represent two areas for which 
generally accepted proposals exist, but whose im- 
plementation in the context of parallel gram- 
mar development throws up questions as to their 
wider, crosslinguistic, feasibility. With respect to 
auxiliaries, the standard raising approach that is 
usually adopted yields undesirable structural com- 
plexity and results in idiosyncratic, language par- 
ticular analyses of the role of auxiliaries. With 
regard to genitive NPs, the standard analysis for 
German yields structures which are too ambiguous
for a successful application of machine translation.
The following sections present a solution in
that morphological wellformedness conditions are 
stated at a separate component, the morphology 
projection. Furthermore, a representation of argu- 
ment structure is implemented that is related to, 
but not identical to the representation of gram- 
matical functions. Language particular idiosyn- 
cratic requirements are thus separated out from 
the language universal information required for 
further semantic interpretation, or machine trans- 
lation. 

2 The Formalism

The architecture of LFG assumed here is the 
"traditional" architecture described in Bresnan 
(1982), as well as the newer advances within LFG 



^1 See also Sadler et al. (1990), Sadler and Thompson
(1991), Kaplan and Wedekind (1993), Butt (1994)
for further work on MT within LFG.



(Dalrymple et al., 1995). A grammar is viewed
as a set of correspondences expressed in terms of 
projections from one level of representation to another.
Two fundamental levels of representation
within LFG are the c(onstituent)-structure and
the f(unctional)-structure. The c-structure encodes
idiosyncratic phrase structural properties of
a given language, while the f-structure provides 
a language universal representation of grammati- 
cal functions (e.g., SUBject, OBJect), complemen- 
tation, tense, binding, etc. The correspondence 
between c-structure and f-structure is not onto or 
one-to-one, but many-to-one, allowing an abstrac- 
tion over idiosyncratic c-structure properties of a 
language (e.g., discontinuous constituents). 
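
The many-to-one character of this correspondence can be illustrated with a small sketch. The node and f-structure labels below are illustrative assumptions, not taken from the PARGRAM grammars; the only point is that several c-structure nodes (here S, VP and V) may correspond to one and the same f-structure.

```python
# phi maps c-structure nodes to f-structures.
# All labels are hypothetical and serve only as an illustration.
phi = {
    "S": "f1",
    "NP_subj": "f2",
    "VP": "f1",
    "V": "f1",
    "NP_obj": "f3",
}

def is_one_to_one(mapping):
    """True if no two nodes share an f-structure."""
    return len(set(mapping.values())) == len(mapping)

def preimage(mapping, fstruct):
    """All c-structure nodes that correspond to the given f-structure."""
    return sorted(node for node, f in mapping.items() if f == fstruct)

print(is_one_to_one(phi))   # False: the mapping is many-to-one
print(preimage(phi, "f1"))  # ['S', 'V', 'VP']
```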

In addition, several proposals exploring possi- 
ble representations of a s(emantic)-structure have 
been made over the years (e.g. Halvorsen and Ka- 
plan (1988), Dalrymple et al. (1993)). As the re- 
alization of a separate semantic component is only 
planned for the later stages within PARGRAM, no
further discussion of possible formalisms will take 
place here. It should be noted, however, that rudi- 
mentary semantic information, such as argument 
structure information (lexical semantics), is en- 
coded within the f-structures in order to facilitate 
transfer in some cases. A case in point is presented 
in the section on German genitive NPs. 

3 Auxiliaries — a flat approach 

3.1 The Received Wisdom 

Auxiliaries have given rise to lively debates concerning
their exact syntactic status (e.g. Chomsky (1957),
Ross (1967), Pullum and Wilson (1977),
Akmajian et al. (1979), Gazdar
et al. (1982)): are they simply main verbs 
with special properties, or should they instan- 
tiate a special category Aux? Within current 
lexical approaches (Lexical-Functional-Grammar 
(LFG), Head-driven Phrase Structure Grammar 
(HPSG)), auxiliaries (e.g. have, be) and modals 
(e.g. must, should) are treated as raising verbs, 
which are marked as special in some way: in 
HPSG through an AUX feature (Pollard and
Sag 1994), in LFG (Bresnan 1982) by a difference
in PRED value.^2 However, newer work within LFG
(Bresnan 1995, T.H. King 1995) has been moving 
away from the raising approach towards an analy- 
sis where auxiliaries are elements which contribute 
to the clause only tense/aspect, agreement, or 
voice information, but not a subcategorization 
frame. This view is also in line with approaches 



within GB (Government-Binding), which see auxiliaries
simply as possible instantiations of the
functional category I (see also Halle and Marantz 
(1993)). 

The "traditional" treatment of auxiliaries in 
both HPSG (Pollard and Sag 1994) and LFG has 
its roots in Ross's (1967) proposal to treat auxiliaries
and modals on a par with main verbs.^3
In particular, auxiliaries are treated as a sub- 
class of raising verbs (e.g. Pollard and Sag (1994), 
Falk (1984)). For example, a simple sentence like 
(1) would correspond to the c-structure and f-structure
shown in (2) and (3), respectively. Note
that the level of embedding in the f-structure ex- 
actly mirrors the c-structure: each verbal element 
takes a complement. 



(1) Der Fahrer wird den Hebel gedreht haben
    the driver will the lever turned  have
    'The driver will have turned the lever.'



(2) [c-structure tree not recoverable; surviving node labels: V[+aux], haben]



^2 See Falk (1984) for an early LFG treatment of
'do' in line with that proposed here, and Abeillé and
Godard (1994) for a similar treatment in French.



^3 The term auxiliary has often been taken to subsume
both modals and elements such as have and be.
However, the distinction between the two is necessary 
not only semantically, but also syntactically. In Ger- 
man and (some dialects of) English modals can be 
stacked, while the distribution of auxiliaries is more 
restricted. Also, assuming that semantic interpreta- 
tion is driven primarily off of the f-structure, the rel- 
ative embedding of modals must be preserved at that 
level in order to allow an interpretation of their scope 
and semantic force. 



(3) [f-structure not recoverable; surviving attribute labels: PRED, TENSE, SUBJ, XCOMP]

This is not desirable from a crosslinguistic point
of view, nor is it helpful for MT.


The main reasons to treat auxiliaries as comple- 
ment taking verbs in English are: 1) an account 
of VP-ellipsis, VP-topicalization, etc. follows im- 
mediately; 2) restrictions on the nature of the ver- 
bal complement (progressive, past participle, etc.) 
following the auxiliary can be stated straightforwardly
(Pullum and Wilson (1977), Akmajian et
al. (1979), Gazdar et al. (1982)). The latter point 
holds for German as well, and in fact, without 
some sort of a hierarchical structure, stating well- 
formedness conditions on a string of multiple aux- 
iliaries becomes wellnigh impossible in light of the 
greater ordering possibilities granted by the flex- 
ible German word order. There are also major 
reasons, however, for not adopting this analysis: 
1) linguistic adequacy; 2) unmotivated structural 
complexity; 3) non-parallel analyses for predica- 
tionally equivalent sentences. Consider the French 
equivalent of (1) in (4). 

(4) Le  conducteur aura      tourné le  levier
    the driver     will have turned the lever
    'The driver will have turned the lever.'

As argued by Akmajian et al. (1979), crosslinguistic
evidence indicates that elements bearing
only tense, mood, or voice should belong to a dis- 
tinct syntactic category. In many languages, like 
French or Japanese, the information carried by 
will (future), or have (perfect) is realized morpho- 
logically rather than periphrastically. The analy- 
sis in (4) thus effectively claims that there exists 
a deep difference in the predicational structure of 
auxiliaries like will and have and the French aura.^4



3.2 Alternative Implementation

The approach adopted here is a flat analysis of
auxiliaries at f-structure ((5)).



(5)
[ PRED   'drehen < SUBJ, OBJ >'
  TENSE  FUTPERF
  SUBJ   [ PRED  'Fahrer'
           CASE  NOM
           GEND  MASC
           NUM   SG
           SPEC  DEF ]
  OBJ    [ PRED  'Hebel'
           CASE  ACC
           GEND  MASC
           NUM   SG
           SPEC  DEF ] ]


The auxiliaries wird 'will' and haben 'have' now 
only contribute information as to the overall tense, 
but do not subcategorize for complements. Struc- 
tural phenomena like VP-ellipsis, coordination, 
or topicalization can, however, still be accounted 
for in terms of an appropriate embedding at c- 
structure (cf. (2)). The role of auxiliaries in nat- 
ural language is now adequately modeled, in par- 
ticular with respect to a more realistic treatment 
of tense (compare (3) and (5)), as the French (4) 
has essentially the same f-structure as (5).^5
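
The contrast between the raising analysis and the flat analysis can be made concrete with a small sketch. The nested dictionaries below are an illustrative assumption and not part of the PARGRAM implementation; in particular, the PRED values given to the auxiliaries in the raising structure and the helper `main_pred_depth` are hypothetical.

```python
# Raising analysis, cf. (3): every auxiliary is a predicate taking an XCOMP.
# The PRED values of the auxiliaries are assumptions made for illustration.
raising = {
    "PRED": "werden<XCOMP>SUBJ",
    "SUBJ": {"PRED": "Fahrer"},
    "XCOMP": {
        "PRED": "haben<XCOMP>SUBJ",
        "XCOMP": {
            "PRED": "drehen<SUBJ,OBJ>",
            "OBJ": {"PRED": "Hebel"},
        },
    },
}

# Flat analysis, cf. (5): the auxiliaries only contribute to TENSE.
flat = {
    "PRED": "drehen<SUBJ,OBJ>",
    "TENSE": "FUTPERF",
    "SUBJ": {"PRED": "Fahrer"},
    "OBJ": {"PRED": "Hebel"},
}

def main_pred_depth(fstruct):
    """Count the XCOMP layers above the innermost predicate."""
    depth = 0
    while "XCOMP" in fstruct:
        fstruct = fstruct["XCOMP"]
        depth += 1
    return depth

print(main_pred_depth(raising))  # 2
print(main_pred_depth(flat))     # 0
```

Under the flat analysis the main predicate sits at the top level, so a French f-structure for (4) with a morphologically realized future perfect would have the same shape.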

However, the flat f-structure in (5) provides 
no room for a statement of selectional require- 
ments, allowing massive overgeneration (e.g. noth- 
ing blocks the presence of two haben in (1)). Nei- 
ther can the particular order of auxiliaries be regu- 
lated. Our solution takes advantage of LFG's flex- 
ible projection-based architecture by implement- 
ing a projection which models the hierarchical se- 
lectional requirements of auxiliaries, yet does not 
interfere with the subcategorizational properties 
of verbs, as would be the case under a raising anal- 
ysis. 



^4 Note that wird 'will' is often analyzed as a modal
in accordance with Vater (1975). However, the argu- 
ments presented there are not conclusive. 



^5 The construction of the value for the composed
tenses results from a complex interaction between the
lexical entries. Note that this treatment does not as
yet include a fine-grained representation of tense and
aspect. This is the subject of ongoing work. The 
treatment presented here provides the basis needed 
for a thorough crosslinguistic analysis of temporal and 
aspectual phenomena. 



(6) [annotated c-structure tree; not recoverable from the source]




The dependencies between predicators and their
arguments, on the one hand, and between auxiliaries
and their dependents, on the other, are thus neatly
factored out. The m-structure corresponding to the
matrix VP in (6) is (7). The desired flat f-structure
resulting from the usual ↑ and ↓ annotations is as in (5).



(7) [m-structure; some attribute values are not recoverable from the source]
[ AUX  +
  FIN  +
  DEP  [ AUX
         FIN
         VFORM  BASE
         DEP    [ FIN
                  VFORM  PERFP ] ] ]



In LFG, the flexible word order of German is 
handled via functional uncertainty, which charac- 
terizes long-distance dependencies without resort- 
ing to movement analyses (Netter (1988), Zaenen 
and Kaplan (1995)). As in (6), which illustrates 
our alternative solution, functional uncertainty is 
represented by the Kleene Star (*).^6 The annotation
on the NPs indicates that they could fulfill the
role of any possible grammatical function (GF),
e.g. SUBJ or OBJ, and that the level of embedding
is unbounded, ranging from zero upwards. With every
auxiliary subcategorizing for an XCOMP, the two NPs
could conceivably be arguments of three different 
verbs: wird, haben, or gedreht. Thus, the greater 
structural complexity unnecessarily increases the 
search space for the determination of a verb's 
arguments. In (6), however, the m-structure is 
projected from the c-structure parallel to the f- 
structure through annotations similar to the usual 
f-structure annotations.^7 Statements about "morphological"
dependents (DEP) are thus decoupled
from functional uncertainty: the relation of NP arguments
to their predicator now does not extend
through various layers of artificial structural
complexity (XCOMPs). For VP-topicalization or
extraposition an unbounded long-distance dependency
must still be assumed. However, as the functional 
uncertainty path for auxiliaries is distributed only 
over the m-structure of the verb complex ((μ M*
DEP*) = μ*), and does not involve the resolution
of the role of NP arguments, there are in fact
differing paths of functional uncertainty involved. 
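
The effect on the search space can be illustrated with a small enumeration. The sketch below is not the algorithm of the Grammar Writer's Workbench; it merely lists every path of the form XCOMP* GF that a single NP could resolve to, under the simplifying assumption that GF ranges over SUBJ and OBJ only. The function name `candidate_paths` and its parameter are hypothetical.

```python
from itertools import product

GFS = ("SUBJ", "OBJ")  # grammatical functions considered in this sketch

def candidate_paths(xcomp_layers):
    """Paths XCOMP^n GF, 0 <= n <= xcomp_layers, that one NP could fill."""
    return [
        ("XCOMP",) * n + (gf,)
        for n, gf in product(range(xcomp_layers + 1), GFS)
    ]

# Raising analysis of (1): wird and haben each introduce an XCOMP layer.
raising_paths = candidate_paths(2)
# Flat analysis: only the f-structure of the main verb is available.
flat_paths = candidate_paths(0)

print(len(raising_paths))  # 6
print(len(flat_paths))     # 2
```

With two NPs to resolve, the number of candidate pairings before any filtering grows with the square of these figures, which is the sense in which the additional XCOMP layers enlarge the search space.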



^6 For space reasons, XC indicates XCOMP and D indicates DEP.

^7 The annotation μ M* in (6) refers to the m-structure
associated with the parent c-structure node,
and μ* refers to the m-structure associated with the
daughter node. The more familiar ↑ and ↓ of LFG
are simply shorthand notations of the same idea,
but restricted to the projection from c-structure to
f-structure: ↑ = φ M*, ↓ = φ *.



Like the f-structure, the m-structure is an
attribute-value matrix. It encodes language-specific
information about idiosyncratic constraints
on morphological forms. The m-structure
is not derived from the f-structure. Rather, both
representations are in simultaneous correspondence
with the c-structure. The following (abbreviated)
lexical entry exemplifies the pieces of
information needed. The disjunctive lexical en- 
try for wird 'will' in (8) takes the various combi- 
natory possibilities of auxiliaries and main verbs 
into account, and provides the appropriate tense 
feature. For example, it requires that the embed- 
ded VFORM be BASE, and that there be no passive 
involved for a simple future like wird drehen. 

(8)
wird AUX
     (μ M* AUX) = +
     { (μ M* DEP VFORM) =c BASE
       (μ M* DEP DEP VFORM) ≠ PERFP
       (↑ PASSIVE) ≠ +
       "simple future: wird drehen"
       (↑ TENSE) = FUT
     |
       (μ M* DEP VFORM) =c BASE
       (μ M* DEP DEP VFORM) =c PERFP
       (↑ PASSIVE) ≠ +
       "future perfect: wird gedreht haben"
       (↑ TENSE) = FUTPERF }
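
How the two disjuncts of (8) select a tense value can be sketched as follows. This is a simplified illustration, not the Grammar Writer's Workbench notation or its solver: m-structures are modelled as nested dictionaries, the defining equation for AUX is simply checked rather than asserted, and the helper names `get` and `wird_tense` are hypothetical.

```python
def get(mstruct, *path):
    """Follow a path of attributes; return None if any step is undefined."""
    for attr in path:
        if not isinstance(mstruct, dict) or attr not in mstruct:
            return None
        mstruct = mstruct[attr]
    return mstruct

def wird_tense(mstruct, passive=False):
    """TENSE contributed by wird, or None if neither disjunct of (8) applies."""
    if get(mstruct, "AUX") != "+":
        return None
    if passive:  # both disjuncts require (↑ PASSIVE) ≠ +
        return None
    if get(mstruct, "DEP", "VFORM") != "BASE":  # constraining equation =c
        return None
    if get(mstruct, "DEP", "DEP", "VFORM") == "PERFP":
        return "FUTPERF"  # second disjunct: wird gedreht haben
    return "FUT"          # first disjunct: wird drehen

wird_drehen = {"AUX": "+", "FIN": "+", "DEP": {"VFORM": "BASE"}}
wird_gedreht_haben = {
    "AUX": "+",
    "FIN": "+",
    "DEP": {"VFORM": "BASE", "DEP": {"VFORM": "PERFP"}},
}
wird_gedreht = {"AUX": "+", "FIN": "+", "DEP": {"VFORM": "PERFP"}}

print(wird_tense(wird_drehen))         # FUT
print(wird_tense(wird_gedreht_haben))  # FUTPERF
print(wird_tense(wird_gedreht))        # None
```

The last case is rejected only because the abbreviated entry shown in (8) covers the two future readings; further disjuncts of the full entry are not modelled here.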

Features needed only to ensure language particular
wellformedness are no longer unified into
the f-structure, where they would clutter a
representation that is meant to be language independent.
In our analysis, only features needed for further
semantic interpretation, MT, or for the expression of
language universal syntactic generalizations are
represented at f-structure. For example, morphologically
encoded information like case, gender, or



agreement is needed for statements as to bind- 
ing, predicate-argument relations, or the determi- 
nation of complex clause structures (given that 
agreement is generally clause-bounded), and is 
therefore represented at f-structure. Wellformed- 
ness conditions on adjective inflection or relative 
pronoun agreement, however, can now be stated 
on the m-structure as idiosyncratic, language par- 
ticular information which can be ignored for pur- 
poses of MT or semantic interpretation. 

4 Multiple Genitive NPs 

The differing surface realization of genitives
within NPs in English (prenominal NPs, postnominal
PPs), French (postnominal PPs), and German
(prenominal NPs, postnominal PPs or NPs) poses
a particular challenge for a parallel grammar
development project like PARGRAM. In this paper,
we suggest a treatment of multiple genitive
NPs which not only accounts for some restrictions 
on their distribution within German, but also al- 
lows a language independent (universal) represen- 
tation, thus facilitating MT. 

In general, the distribution of multiple NPs 
within NPs is an area of German syntax which has 
not received a satisfactory account to date (e.g., 
Pollard and Sag (1994), Bhatt (1990), Haider 
(1988)). In German, nouns generally have at most 
one genitive which may occur in a prenominal or 
postnominal position adjacent to the noun. Both 
kinds of genitives have the same morphological 
shape. However, nominalizations that are derived 
from a transitive verb allow for two genitives, one 
in the prenominal, the other in the postnominal 
position. 

The function of a genitive is generally expressed 
as indicating a possessor: POSS within LFG. How- 
ever, in the case of two genitives, the assignment of 
two POSS values violates the uniqueness-condition 
on f-structures and is furthermore insufficient to 
distinguish the two differing kinds of genitives. We 
therefore propose the utilization of two functions
named GEN1 and GEN2 in order to avoid association
with any specific semantic role. Furthermore,
as genitives in the NP are generally optional, they
are taken not to express governed functions, i.e.,
they are not subcategorized for by the noun.
GEN1 and GEN2 are thus semantic functions in LFG on
a par with, say, adjuncts. The NP rule for German
is then (9).^8



(9) NP →  ( { DET: ↑=↓
            | NP:  (↑ GEN1)=↓ } )
          N:  ↑=↓
          ( NP: (↑ GEN2)=↓ )
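
The uniqueness problem that motivates the two functions can be illustrated with a minimal sketch. It is a simplification under stated assumptions: f-structures are plain dictionaries, a clash is detected by simple inequality of values rather than by unification, and the names `UniquenessError` and `add_function` are hypothetical.

```python
class UniquenessError(ValueError):
    """Raised when an attribute would receive two conflicting values."""

def add_function(fstruct, attr, value):
    """Add attr = value to an f-structure, enforcing uniqueness."""
    if attr in fstruct and fstruct[attr] != value:
        raise UniquenessError(f"{attr} already has a different value")
    fstruct[attr] = value
    return fstruct

# Two genitives both analyzed as POSS clash ...
poss_np = {"PRED": "Behandlung"}
add_function(poss_np, "POSS", {"PRED": "Karl"})
try:
    add_function(poss_np, "POSS", {"PRED": "Peter"})
    clash = False
except UniquenessError:
    clash = True

# ... whereas GEN1 and GEN2 are distinct attributes, as in rule (9).
gen_np = {"PRED": "Behandlung"}
add_function(gen_np, "GEN1", {"PRED": "Karl"})
add_function(gen_np, "GEN2", {"PRED": "Peter"})

print(clash)           # True
print(sorted(gen_np))  # ['GEN1', 'GEN2', 'PRED']
```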

If the head-noun is not derived from, say, a verb, 
the single genitive in either position is interpreted 
as a possessor. In case of a derived nominal, how- 
ever, a genitive is interpreted according to the the- 
matic roles assigned to the arguments of the verbal 
base. That means the functions GEN1 and GEN2
have to be linked to the appropriate roles. Neither 
of the two functions is in principle restricted to 
any specific role. But if both genitives are present 
they must be interpreted according to a thematic 
role hierarchy. 

As (10) shows, if only one genitive is present in
prenominal position, it may be interpreted as agent
or as patient. A single postnominal genitive is
interpreted as agent if the head noun is derived from
an intransitive, and as a patient/theme if derived
from a transitive.



(10) a. Elisabeths    Lachen
        Elisabeth-Gen laughing
        'Elisabeth's laughter'

     b. Roms     Belagerung
        Rome-Gen siege
        'Rome's siege'

However, if two genitives occur, as in (11), the 
prenominal genitive is restricted to an agent, and 
the postnominal one to patient. This restriction 
must be encoded at some level, but does not follow
from the distinction between GEN1 and GEN2,
which are functions that do not bear any semantic 
content on their own. 



(11) 

Karls Behandlung Peters 
Karl-Gen treatment Peter-Gen 
'Karl's treatment of Peter' 
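
The interpretive restrictions described for deverbal nouns in (10) and (11) can be summarized in a short sketch. It is only an illustration of the generalizations stated in the text, not the linking mechanism of the implementation; the function `link_genitives` and its arguments are hypothetical, and non-derived nouns (possessor reading) are left out.

```python
def link_genitives(transitive_base, gen1, gen2):
    """Admissible role assignments for the genitives of a deverbal noun.

    transitive_base: whether the verbal base is transitive
    gen1, gen2:      whether a prenominal / postnominal genitive is present
    """
    if gen1 and gen2:
        if not transitive_base:
            return []  # two genitives require a transitive base
        return [{"GEN1": "agent", "GEN2": "patient"}]
    if gen1:
        roles = ["agent", "patient"] if transitive_base else ["agent"]
        return [{"GEN1": role} for role in roles]
    if gen2:
        return [{"GEN2": "patient" if transitive_base else "agent"}]
    return [{}]

# (11) Karls Behandlung Peters: one reading only
print(link_genitives(True, True, True))
# (10b) Roms Belagerung: agent or patient
print(link_genitives(True, True, False))
# (10a) Elisabeths Lachen: agent
print(link_genitives(False, True, False))
```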

To obtain the correct linking, the argument 
structure of the verbal base must be available. 
Since MT is based on f-structures within PAR- 
GRAM, the argument structure has to be present 
at this level of representation.^9 Nominalization
is therefore implemented as a morphologically 
driven process (lexical rule) which eliminates SUBJ 



^8 Abstracting away from bar-level considerations
and further optional constituents, this rule captures 
the restrictions that determine the dislocation of a 
genitive in the matrix NP. 



^9 If a semantic or argument projection is assumed
(e.g., Halvorsen and Kaplan, 1988), this informa- 
tion may be represented at another independent 
projection. 



and OBJ from the verb's subcategorization frame 
and enters the verb's argument structure into the 
lexical entry of the noun. This yields the option- 
ality of genitives while preserving the underlying 
semantics, as shown in (12). The association of 
GEN1 and GEN2 is then determined according to
a hierarchical order of arguments (Bresnan, 1995). 

This approach also provides a means of han- 
dling certain cases of categorial shift. For in- 
stance, in German temporal and conditional ad- 
juncts may be realized as PPs dominating an NP 
headed by a deverbal noun. English does not 
have this option, but employs an adjunct-clause 
instead. Here, the GEN1 and GEN2 functions of
the German f-structure have to be related correctly
to the SUBJ and OBJ functions of the English
f-structure. 
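
The relation between the German and the English functions can be sketched as a small transfer step over the f-structures given in (13). The sketch rests on several assumptions that are not part of the original implementation: both genitives are present, GEN1 is linked to ARG1 and GEN2 to ARG2, and the role-to-function table, the toy lexicon and the function `transfer` are hypothetical.

```python
# Assumed mapping from thematic roles to English grammatical functions.
ROLE_TO_GF = {"AGENT": "SUBJ", "THEME": "OBJ"}
# Toy bilingual lexicon covering only the predicates in (13).
LEXICON = {"Darstellung": "report", "Karl": "Karl", "Vorfall": "accident"}

def transfer(german):
    """Map the f-structure of a German deverbal NP to an English verbal one."""
    args = german["ARG-STR"]
    # Linking assumed for the two-genitive case: GEN1 -> ARG1, GEN2 -> ARG2.
    linking = {"GEN1": args["ARG1"], "GEN2": args["ARG2"]}
    english = {"PRED": LEXICON[german["PRED"]] + "<SUBJ,OBJ>"}
    for gen, role in linking.items():
        if gen in german:
            pred = LEXICON[german[gen]["PRED"]]
            english[ROLE_TO_GF[role]] = {"PRED": pred}
    return english

german_np = {
    "PRED": "Darstellung",
    "ARG-STR": {"ARG1": "AGENT", "ARG2": "THEME"},
    "GEN1": {"PRED": "Karl"},
    "GEN2": {"PRED": "Vorfall"},
}

print(transfer(german_np))
```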



(12) bei Karls    Darstellung des Vorfalls     mussten   alle lachen
     at  Karl-Gen report      the accident-Gen must-Past all  laugh
     'when Karl reported the accident everyone had to laugh'




Here the linking of the GEN1 and GEN2 functions
to the appropriate thematic role in the German
f-structure drives the transfer of these functions
to the SUBJ and OBJ functions of the English
f-structure.

(13)
[ PRED     'Darstellung'
  ARG-STR  [ ARG1  AGENT
             ARG2  THEME ]
  GEN1     [ PRED 'Karl' ]
  GEN2     [ PRED 'Vorfall' ] ]

[ PRED  'report < SUBJ, OBJ >'
  SUBJ  [ PRED 'Karl' ]
  OBJ   [ PRED 'accident' ] ]

Under this approach, languages now only differ
with respect to the categorial realisation of the
function by case-marked NP or PP. Thus, an
application of this treatment not only provides an
adequate grammatical analysis of the NP in German,
but also facilitates MT.

5 Conclusion

The approaches proposed allow the factorization of
language particular, idiosyncratic information. This
results in a cleaner treatment of auxiliaries by
factoring out morphological wellformedness conditions,
and allows for the preservation of argument structure
information in cases like that of the German multiple
genitive NP construction, where syntactically
dissimilar constructions express essentially
the same predicate-argument relations. As such,
the work presented here can be seen as a small but
necessary step towards the realization of a broad
coverage grammar. In particular, the feasibility
of developing parallel grammars for differing
languages is greatly increased through the formulation
of uniformly applicable, transparent analyses.